Abstract

A CSP with n variables ranging over a domain of d values can be solved by brute force in $d^n$ steps (omitting a
polynomial factor). With a more careful approach, this trivial upper bound can be improved for certain natural
restrictions of the CSP. In this paper we establish theoretical limits to such improvements, and draw a detailed
landscape of the subexponential-time complexity of CSP.

We first establish relations between the subexponential-time complexity of CSP and that of other problems,
including CNF-Sat. We exploit this connection to provide tight characterizations of the subexponential-time
complexity of CSP under common assumptions in complexity theory. For several natural CSP parameters, we
obtain threshold functions that precisely dictate the subexponential-time complexity of CSP with respect to the
parameters under consideration.

Our analysis provides fundamental results indicating whether and when one can significantly improve on the 
brute-force search approach for solving CSP. 

1 Introduction



The Constraint Satisfaction Problem (CSP) provides a general and uniform framework for the representation
and solution of hard combinatorial problems that arise in various areas of Artificial Intelligence and Computer
Science [Rossi et al., 2006]. For instance, in database theory, the CSP is equivalent to the evaluation problem of
conjunctive queries on relational databases [Gottlob et al., 2002].

It is well known that CSP is NP-hard, as it entails fundamental NP-hard problems such as 3-Colorability
and 3-CNF-Sat. Hence, we cannot hope for a polynomial-time algorithm for CSP. On the other hand, CSP
can obviously be solved in exponential time: by simply trying all possible instantiations of the variables, we
can solve a CSP instance consisting of n variables that range over a domain of d values in time $d^n$ (omitting a
polynomial factor in the input size). Significant work has been concerned with improving this trivial upper bound
[Feder and Motwani, 2002; Beigel and Eppstein, 2005; Grandoni and Italiano, 2006], in particular for certain
restrictions of CSP. For instance, binary CSP with domain size d can now be solved in time $(d-1)^n$ (omitting
a polynomial factor in the input size) by a forward-checking algorithm employing a fail-first variable ordering
heuristic [Razgon, 2006]. All these improvements over the trivial brute-force search give exponential running
times in which the exponent is linear in n.

The aim of this paper is to investigate the theoretical limits of such improvements. More precisely, we 
explore whether the exponential factor $d^n$ can be reduced to a subexponential factor $d^{o(n)}$ or not, considering
various natural NP-hard restrictions of the CSP. We note that the study of the existence of subexponential-time 
algorithms is of prime interest, as a subexponential-time algorithm for a problem would allow us to solve larger 
hard instances of the problem in comparison to an exponential-time algorithm. 

Results We obtain lower and upper bounds and draw a detailed complexity landscape of CSP with respect to 
subexponential-time solvability. Our lower bounds are subject to (variants of) the Exponential Time Hypothesis 
(ETH), proposed by Impagliazzo and Paturi [2001], which states that 3-CNF-Sat has no subexponential-time
algorithm. 

It is easy to see that CSP of bounded domain size (i.e., the maximum number of values for each variable) and
bounded arity (i.e., the maximum number of variables that appear together in a constraint) has a
subexponential-time algorithm if and only if the ETH fails. Our first result provides evidence that when we drop
the bound on the domain size or the bound on the arity, the problem becomes "harder" (we refer to the discussion
preceding Proposition 2):

1. If Boolean CSP is solvable in nonuniform subexponential time then so is (unrestricted) CNF-Sat. 

2. If 2-CSP (all constraints have arity 2) is solvable in subexponential time then CLIQUE is solvable in time
$N^{o(k)}$ (N is the number of vertices and k is the clique-size).

As it turns out, the number of tuples plays an important role in characterizing the subexponential time complexity 
of CSP. We show the following tight result: 

3. CSP is solvable in subexponential time for instances in which the number of tuples is o(n), and unless the
ETH fails, is not solvable in subexponential time if the number of tuples in the instances is Ω(n).

For Boolean CSP of linear size we can even derive an equivalence to the ETH: 

4. Boolean CSP for instances of size Ω(n) is solvable in subexponential time if and only if the ETH fails.

Results 3 and 4 also hold if we consider the total number of tuples in the constraint relations instead of the input 
size. 

By a classical result of Freuder [1990], CSP becomes easier if the instance has small treewidth. There are
several ways of measuring the treewidth of a CSP instance, depending on the graph used to model the structure 
of the instance. The most common models are the primal graph and the incidence graph. The former has as 
vertices the variables of the CSP instance, and two variables are adjacent if they appear together in a constraint. 
The incidence graph is the bipartite graph on the variables and constraints, where a variable is incident to all the 
constraints in which it is involved. We show that the treewidth of these two graph models gives rise to different
subexponential-time complexities:

5. CSP is solvable in subexponential time for instances whose primal treewidth is o(n), but is not solvable in
subexponential time for instances whose primal treewidth is Ω(n), assuming the ETH.

6. CSP is solvable in polynomial time for instances whose incidence treewidth is O(1), but is not solvable in
subexponential time for instances whose incidence treewidth is ω(1) unless the ETH fails.

Our tight results, summarized in the table at the end of this paper, provide strong theoretical evidence that some of
the natural restrictions of CSP may be "harder than" k-CNF-Sat, for which a subexponential-time algorithm
would lead to the failure of the ETH. Hence, our results provide a new point of view of the relationship between
SAT and CSP, an important topic of recent AI research [Jeavons and Petke, 2012; Dimopoulos and Stergiou, 2006;
Benhamou et al., 2012; Bennaceur, 2004].

2 Preliminaries 

2.1 Constraint satisfiability and CNF-satisfiability 

An instance I of the CONSTRAINT SATISFACTION PROBLEM (or CSP, for short) is a triple (V, D, C), where V
is a finite set of variables, D is a finite set of domain values, and C is a finite set of constraints. Each constraint
in C is a pair (S, R), where S, the constraint scope, is a non-empty sequence of distinct variables of V, and R,
the constraint relation, is a relation over D whose arity matches the length of S; a relation is considered as a set
of tuples. Therefore, the size of a CSP instance I = (V, D, C) is the sum $\sum_{(S,R) \in C} |S| \cdot |R|$; the total number of
tuples is $\sum_{(S,R) \in C} |R|$. We assume, without loss of generality, that every variable occurs in at least one constraint
scope and every domain element occurs in at least one constraint relation. Consequently, the size of an instance
I is at least as large as the number of variables in I. We write var(C) for the set of variables that occur in the
scope of constraint C.

An assignment or instantiation is a mapping from the set V of variables to the domain D. An assignment
τ satisfies a constraint $C = ((x_1, \ldots, x_n), R)$ if $(\tau(x_1), \ldots, \tau(x_n)) \in R$, and τ satisfies the CSP instance if it
satisfies all its constraints. An instance I is consistent or satisfiable if it is satisfied by some assignment. CSP is
the problem of deciding whether a given instance of CSP is consistent. BOOLEAN CSP denotes the CSP with
the Boolean domain {0, 1}. By r-CSP we denote the restriction of CSP to instances in which the arity of each
constraint is at most r.
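
As an illustration of these definitions, the following minimal Python sketch decides consistency by trying all $d^n$ assignments. The representation (a list of variables, a list of domain values, and a list of (scope, relation) pairs) and the name `solve_csp_bruteforce` are assumptions made for this example only.

```python
from itertools import product

def solve_csp_bruteforce(variables, domain, constraints):
    """Return a satisfying assignment as a dict, or None if none exists.

    constraints is a list of (scope, relation) pairs: scope is a tuple of
    distinct variables, relation is a set of allowed value tuples.
    """
    for values in product(domain, repeat=len(variables)):  # d^n candidates
        tau = dict(zip(variables, values))
        if all(tuple(tau[x] for x in scope) in relation
               for scope, relation in constraints):
            return tau
    return None

# Hypothetical toy instance: x != y and y != z over the Boolean domain.
neq = {(0, 1), (1, 0)}
toy = (["x", "y", "z"], [0, 1], [(("x", "y"), neq), (("y", "z"), neq)])
print(solve_csp_bruteforce(*toy))
```

The loop visits at most $d^n$ assignments and checks each one in time polynomial in the instance size, which matches the trivial upper bound discussed in the introduction.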

For an instance I = (V, D, C) of CSP we define the following basic parameters:

• vars: the number |V| of variables, usually denoted by n;
• size: the size of the CSP instance;
• dom: the number |D| of values;
• cons: the number |C| of constraints.
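
The parameters above, together with the total number of tuples, can be computed directly from an instance. This is a small sketch using the same assumed representation (lists of variables and values, and (scope, relation) pairs); the function name `csp_parameters` is hypothetical.

```python
def csp_parameters(variables, domain, constraints):
    """Compute vars, dom, cons, size and the total number of tuples."""
    return {
        "vars": len(variables),
        "dom": len(domain),
        "cons": len(constraints),
        # size = sum over constraints (S, R) of |S| * |R|
        "size": sum(len(scope) * len(rel) for scope, rel in constraints),
        # total number of tuples = sum over constraints of |R|
        "tuples": sum(len(rel) for _, rel in constraints),
    }

neq = {(0, 1), (1, 0)}
params = csp_parameters(["x", "y", "z"], [0, 1],
                        [(("x", "y"), neq), (("y", "z"), neq)])
print(params)
```

Note that the number of tuples never exceeds the size, since every scope is non-empty.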

CNF-Sat is the satisfiability problem for propositional formulas in conjunctive normal form (CNF). k-CNF-Sat
denotes CNF-Sat restricted to formulas where each clause is of width at most k, i.e., contains at most k literals.

2.2 Subexponential time 

The time complexity functions used in this paper are assumed to be proper complexity functions that are
unbounded and nondecreasing. The o(·) notation used denotes the $o^{\mathrm{eff}}(\cdot)$ notation [Flum and Grohe, 2006]. More
formally, for any two computable functions $f, g : \mathbb{N} \to \mathbb{N}$, by writing f(n) = o(g(n)) we mean that there exists
a computable nondecreasing unbounded function $\mu(n) : \mathbb{N} \to \mathbb{N}$, and $n_0 \in \mathbb{N}$, such that $f(n) \le g(n)/\mu(n)$ for
all $n \ge n_0$.

It is clear that CSP and CNF-Sat are solvable in time $\mathrm{dom}^n |I|^{O(1)}$ and $2^n |I|^{O(1)}$, respectively, where I is the
input instance and n is the number of variables in I. We say that the CSP (resp. CNF-Sat) problem is solvable
in uniform subexponential time if there exists an algorithm that solves the problem in time $\mathrm{dom}^{o(n)} |I|^{O(1)}$ (resp.
$2^{o(n)} |I|^{O(1)}$). Using the results of [Chen et al., 2009; Flum and Grohe, 2006], the above definition is equivalent
to the following: The CSP (resp. CNF-Sat) problem is solvable in uniform subexponential time if there exists
an algorithm that for all ε = 1/ℓ, where ℓ is a positive integer, solves the problem in time $\mathrm{dom}^{\varepsilon n} |I|^{O(1)}$ (resp.
$2^{\varepsilon n} |I|^{O(1)}$). The CSP (resp. CNF-Sat) problem is solvable in nonuniform subexponential time if for each
ε = 1/ℓ, where ℓ is a positive integer, there exists an algorithm $A_\varepsilon$ that solves the problem in time $\mathrm{dom}^{\varepsilon n} |I|^{O(1)}$
(resp. $2^{\varepsilon n} |I|^{O(1)}$) (that is, the algorithm depends on ε). We note that subexponential-time algorithms running in
$O(2^{\sqrt{n}})$ time do exist for many natural problems [Alber et al., 2004].

Let Q and Q' be two problems, and let μ and μ' be two parameter functions defined on instances of Q
and Q', respectively. In the case of CSP and CNF-Sat, μ and μ' will be the number of variables in the
instances of these problems. A subexponential-time Turing reduction family [Impagliazzo et al., 2001] (see
also [Flum and Grohe, 2006]), a serf-reduction¹ for short, is an algorithm A with an oracle to Q' such that there
are computable functions $f, g : \mathbb{N} \to \mathbb{N}$ satisfying: (1) given a pair (I, ε) where I ∈ Q and ε = 1/ℓ (ℓ is a
positive integer), A decides I in time $f(1/\varepsilon)\, \mathrm{dom}^{\varepsilon \mu(I)} |I|^{O(1)}$ (for CNF-Sat dom = 2); and (2) for all oracle
queries of the form "I' ∈ Q'" posed by A on input (I, ε), we have $\mu'(I') \le g(1/\varepsilon)(\mu(I) + \log |I|)$.

The optimization class SNP consists of all search problems expressible by second-order existential formulas
whose first-order part is universal [Papadimitriou and Yannakakis, 1991]. Impagliazzo et al. [2001] introduced
the notion of completeness for the class SNP under serf-reductions, and identified a class of problems which are
complete for SNP under serf-reductions, such that the subexponential-time solvability for any of these problems
implies the subexponential-time solvability of all problems in SNP. Many well-known NP-hard problems are
proved to be complete for SNP under the serf-reduction, including 3-Sat, Vertex Cover, and Independent
Set, for which extensive efforts have been made in the last three decades to develop subexponential-time
algorithms with no success. This fact has led to the exponential-time hypothesis, ETH, which is equivalent to the
statement that not all SNP problems are solvable in subexponential time:



Exponential-Time Hypothesis (ETH): The problem k-CNF-Sat, for any k ≥ 3, cannot be solved in time
$2^{o(n)}$, where n is the number of variables in the input formula. Therefore, there exists c > 0 such that
k-CNF-Sat cannot be solved in time $2^{cn}$.



¹ Serf-reductions were introduced by Impagliazzo et al. [Impagliazzo et al., 2001]. Here we use the definition given by Flum and
Grohe [Flum and Grohe, 2006]. There is a slight difference between the two definitions, and the latter definition is more flexible for our
purposes.



The following result is implied by [Impagliazzo et al., 2001, Corollary 1] and by the proof of the
Sparsification Lemma [Impagliazzo et al., 2001], [Flum and Grohe, 2006, Lemma 16.17].



Lemma 1. k-CNF-Sat (k ≥ 3) is solvable in $2^{o(n)}$ time if and only if k-CNF-Sat with a linear number of
clauses and in which the number of occurrences of each variable is upper bounded by a constant is solvable in
time $2^{o(n)}$, where n is the number of variables in the formula (note that the size of an instance of k-CNF-Sat is
polynomial in n).

The ETH has become a standard hypothesis in complexity theory [Lokshtanov et al., 2011].

We close this section by mentioning some further work on the subexponential-time complexity of CSP. There
are several results on 2-CSP with bounds on tw, the treewidth of the primal graph (see Section 5 for definitions).
Lokshtanov et al. [2011] showed the following lower bound, using a result on list coloring [Fellows et al., 2011]:
2-CSP cannot be solved in time $f(\mathrm{tw})\, n^{o(\mathrm{tw})}$ unless the ETH fails. Marx [2010a] showed that if there is a
recursively enumerable class $\mathcal{G}$ of graphs with unbounded treewidth and a function f such that 2-CSP can be
solved in time $f(G)\, n^{o(\mathrm{tw}/\log \mathrm{tw})}$ for instances whose primal graph G is in $\mathcal{G}$, then the ETH fails. Traxler [2008]
studied the subexponential-time complexity of CSP where the constraints are represented by listing the forbidden
tuples (in contrast to the standard representation that we use, where the allowed tuples are given, and which
naturally captures database problems [Gottlob et al., 2002; Grohe, 2006; Papadimitriou and Yannakakis, 1999]).
This setting can be considered as a generalisation of CNF-Sat; a single clause gives rise to a constraint with
exactly one forbidden tuple.

3 Relations between CSP and CNF-Sat 

In this section, we investigate the relation between the subexponential-time complexity of CSP and that of
CNF-Sat. A clause of constant width can be represented by a constraint of constant arity; the reverse holds as well
(we get a constant number of clauses). Hence, we have: 

Proposition 1. Boolean r-CSP is solvable in subexponential time if and only if the ETH fails. 
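
The reverse direction mentioned before Proposition 1 can be sketched as follows: a Boolean constraint of arity r is replaced by one clause per forbidden tuple, which gives at most $2^r$ clauses of width r. The literal encoding (a pair of a variable and a polarity flag) and the name `constraint_to_clauses` are assumptions made for this illustration.

```python
from itertools import product

def constraint_to_clauses(scope, relation):
    """Translate a Boolean constraint into an equivalent list of clauses.

    A literal is a pair (variable, polarity): (x, True) stands for x and
    (x, False) for the negation of x.
    """
    clauses = []
    for t in product((0, 1), repeat=len(scope)):
        if t not in relation:
            # This clause is falsified exactly by the forbidden tuple t.
            clauses.append([(x, v == 0) for x, v in zip(scope, t)])
    return clauses

# Hypothetical example: the disequality constraint x != y.
clauses = constraint_to_clauses(("x", "y"), {(0, 1), (1, 0)})
print(clauses)
```

For constant r the number of clauses per constraint is a constant, so the number of variables is preserved and the formula size stays linear in the instance size.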

The following proposition suggests that Proposition 1 may not extend to r-CSP with unbounded domain
size. Chen et al. [2005] showed that if CLIQUE (decide whether a given graph on N vertices
contains a complete subgraph of k vertices) is solvable in time $N^{o(k)}$ then the ETH fails. The converse, however,
is generally believed not to be true. The idea behind the proof of the proposition goes back to the paper by
Papadimitriou and Yannakakis [1999], where they used it in the context of studying the complexity of database
queries. We skip the proof, and refer the reader to the original source [Papadimitriou and Yannakakis, 1999].

Proposition 2. If 2-CSP is solvable in subexponential time then CLIQUE is solvable in time $N^{o(k)}$.
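
The text omits the proof of Proposition 2. The sketch below shows a standard encoding of CLIQUE as 2-CSP, which we assume is close to the idea referred to above but is not taken from the source: one variable per clique position, the vertex set as domain, and a binary constraint on every pair of variables that allows exactly the adjacent pairs. The function names are hypothetical, and the graph is assumed to be simple (no self-loops).

```python
from itertools import combinations, product

def clique_to_2csp(vertices, edges, k):
    """Build a 2-CSP instance that is consistent iff the graph has a k-clique."""
    variables = [f"x{i}" for i in range(k)]
    # Allowed pairs: both orientations of every edge, so the two values
    # are distinct and adjacent.
    relation = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    constraints = [((a, b), relation) for a, b in combinations(variables, 2)]
    return variables, list(vertices), constraints

def is_consistent(variables, domain, constraints):
    """Brute-force consistency check, used only to exercise the encoding."""
    return any(
        all((tau[a], tau[b]) in rel for (a, b), rel in constraints)
        for values in product(domain, repeat=len(variables))
        for tau in [dict(zip(variables, values))]
    )

# Hypothetical graph: a triangle 1-2-3 with a pendant vertex 4.
graph_vertices = [1, 2, 3, 4]
graph_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(is_consistent(*clique_to_2csp(graph_vertices, graph_edges, 3)))
```

The resulting instance has n = k variables and dom = N values, so a running time of $\mathrm{dom}^{o(n)}$ (times a polynomial in the instance size) for 2-CSP would translate into $N^{o(k)}$ for CLIQUE.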

We explore next the relation between BOOLEAN CSP with unbounded arity and CNF-Sat. We show that 
if Boolean CSP is solvable in nonuniform subexponential time then so is CNF-Sat. To do so, we exhibit a 
nonuniform subexponential-time Turing reduction from CNF-Sat to Boolean CSP. 

Intuitively, one would try to reduce an instance F of CNF-Sat to an instance I of CSP by associating with
every clause in F a constraint in I whose variables are the variables in the clause, and whose relation consists of
all tuples that satisfy the clause. There is a slight complication in such an attempted reduction because the number
of tuples in a constraint could be exponential if the number of variables in the corresponding clause is linear (in
the total number of variables). To overcome this subtlety, the idea is to first apply a subexponential-time (Turing)
reduction, which is originally due to Schuler [2005] and was also used and analyzed by Calabro et al. [2006], that
reduces the instance F to subexponentially-many (in n) instances in which the width of each clause is at most
some constant k; in our case, however, we will reduce the width to a suitable nonconstant value. We follow this
reduction with the reduction to Boolean CSP described above.
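
The clause-to-constraint step described in this paragraph can be sketched as follows. A clause of width w yields a constraint with $2^w - 1$ tuples, which is why wide clauses must be shortened first. The literal encoding (variable, polarity) and the name `clause_to_constraint` are assumptions made for the example.

```python
from itertools import product

def clause_to_constraint(clause):
    """Turn a clause into a Boolean constraint (scope, relation).

    clause is a list of literals (variable, polarity) over distinct
    variables; polarity True means the positive literal.
    """
    scope = tuple(x for x, _ in clause)
    relation = {
        t for t in product((0, 1), repeat=len(clause))
        # keep t if it makes at least one literal of the clause true
        if any((v == 1) == pol for v, (_, pol) in zip(t, clause))
    }
    return scope, relation

# Hypothetical clause (x or not y or z).
scope, relation = clause_to_constraint([("x", True), ("y", False), ("z", True)])
print(scope, len(relation))
```

Only the single falsifying tuple is excluded, so the relation size doubles with every additional literal.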

Theorem 1. If Boolean CSP has a nonuniform subexponential-time algorithm then so does CNF-Sat.

Proof. Suppose that BOOLEAN CSP is solvable in nonuniform subexponential time. Then for every δ > 0, there
exists an algorithm $A'_\delta$ that, given an instance I of BOOLEAN CSP with n' variables, solves I in time $2^{\delta n'} |I|^{c'}$,
for some constant c' > 0.

Let 0 < ε < 1 be given. We describe an algorithm $A_\varepsilon$ that solves CNF-Sat in time $2^{\varepsilon n} m^{O(1)}$. Set
$k = \lfloor \varepsilon n / (2(1+c')) \rfloor$. Let F be an instance of CNF-Sat with n variables and m clauses. The algorithm $A_\varepsilon$ is a
search-tree algorithm, and works as follows. The algorithm picks a clause C in F of width more than k; if no
such clause exists the algorithm stops. Let $l_1, \ldots, l_k$ be any k literals in C. The algorithm branches on C into
two branches. The first branch, referred to as a left branch, corresponds to one of these k literals being assigned
the value 1 in the satisfying assignment sought, and in this case C is replaced in F by the clause $(l_1 \vee \ldots \vee l_k)$,
thus reducing the number of clauses in F of width more than k by 1. The second branch, referred to as a right
branch, corresponds to assigning all those k literals the value 0 in the satisfying assignment sought; in this case
the values of the variables corresponding to those literals have been determined, and the variables can be removed
from F and F gets updated accordingly. Therefore, in a right branch the number of variables in F is reduced
by k. The execution of the part of the algorithm described so far can be depicted by a binary search tree whose
leaves correspond to instances resulting from F at the end of the branching, and in which each clause has width
at most k. The running time of this part of the algorithm is proportional to the number of leaves in the search
tree, or equivalently, the number of root-leaf paths in the search tree. Let F' be an instance resulting from F at
a leaf of the search tree. We reduce F' to an instance $I_{F'}$ of BOOLEAN CSP as follows. For each clause C' in
F', we associate with it a constraint whose variable-set is the set of variables in C', and whose tuples consist of
at most $2^k - 1$ tuples corresponding to all assignments to the variables in C' that satisfy C'. Clearly, $I_{F'}$ can be
constructed in time $2^k m^{O(1)}$ (note that the number of clauses in F' is at most m). To the instance $I_{F'}$, we apply
the algorithm $A'_\delta$ with δ = ε/2. The algorithm $A_\varepsilon$ accepts F if and only if $A'_\delta$ accepts one of the instances $I_{F'}$,
for some F' resulting from F at a leaf of the search tree.
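
The branching part of the algorithm can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that clauses are lists of (variable, polarity) literals, that no clause mentions a variable twice, and that k ≥ 1. The helper `satisfiable` is a brute-force check included only to exercise the sketch.

```python
from itertools import product

def reduce_width(clauses, k):
    """Yield the leaf formulas of the search tree (all clauses of width <= k).

    The input formula is satisfiable iff at least one yielded formula is.
    """
    wide = next((c for c in clauses if len(c) > k), None)
    if wide is None:
        yield clauses
        return
    chosen = wide[:k]
    # Left branch: one of the k chosen literals is true, so replace the
    # wide clause by the short clause on the chosen literals.
    yield from reduce_width([c for c in clauses if c is not wide] + [chosen], k)
    # Right branch: all k chosen literals are false; simplify the formula.
    falsified = set(chosen)
    satisfied = {(x, not pol) for x, pol in chosen}
    simplified = []
    for c in clauses:
        if any(lit in satisfied for lit in c):
            continue                      # clause is already satisfied
        shorter = [lit for lit in c if lit not in falsified]
        if not shorter:
            return                        # empty clause: dead branch
        simplified.append(shorter)
    yield from reduce_width(simplified, k)

def satisfiable(clauses):
    """Brute-force satisfiability test for small formulas."""
    xs = sorted({x for c in clauses for x, _ in c})
    return any(
        all(any(tau[x] == pol for x, pol in c) for c in clauses)
        for vals in product((False, True), repeat=len(xs))
        for tau in [dict(zip(xs, vals))]
    )

# Hypothetical formula: (a or b or c or d), (not a), (not b), (not c).
F = [[("a", True), ("b", True), ("c", True), ("d", True)],
     [("a", False)], [("b", False)], [("c", False)]]
leaves = list(reduce_width(F, 2))
print(len(leaves), any(satisfiable(leaf) for leaf in leaves))
```

In the proof, each leaf formula would then be translated clause by clause into a Boolean CSP instance and passed to the assumed algorithm for Boolean CSP.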

The running time of $A_\varepsilon$ is upper bounded by the number of leaves in the search tree, multiplied by a
polynomial in the length of F (polynomial in m) corresponding to the (maximum) total running time along a root-leaf
path in the search tree, multiplied by the time to construct the instance $I_{F'}$ corresponding to F' at a leaf of the
tree, and multiplied by the running time of the algorithm $A'_\delta$ applied to $I_{F'}$. Note that the binary search tree
depicting the execution of the algorithm is not a complete binary tree. To upper bound the size of the search tree,
let P be a root-leaf path in the search tree, and let ℓ be the number of right branches along P. Since each right
branch removes k variables, ℓ ≤ n/k and the number of variables left in the instance F' at the leaf endpoint of
P is n − ℓk. Noting that the length of a path with ℓ right branches is at most m + ℓ (each left branch reduces m
by 1 and hence there can be at most m such branches on P, and there are ℓ right branches), we conclude that the
number of root-leaf paths, and hence the number of leaves, in the search tree is at most $\sum_{\ell=0}^{\lceil n/k \rceil} \binom{m+\ell}{\ell}$.

The reduction from F' to an instance of BOOLEAN CSP can be carried out in time $2^k m^{O(1)}$, and results in
an instance $I_{F'}$ in which the number of variables is at most n' = n − ℓk, the number of constraints is at most m,
and the total size is at most $2^k m^{O(1)}$. Summing over all possible paths in the search tree, the running time of $A_\varepsilon$
is $2^{\varepsilon n} m^{O(1)}$. This is a consequence of the following estimation:

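\begin{align}
\sum_{\ell=0}^{\lceil n/k \rceil} \binom{m+\ell}{\ell}\, 2^{k} m^{O(1)}\, 2^{\delta(n-\ell k)} \bigl(2^{k} m^{O(1)}\bigr)^{c'}
 &\le 2^{(1+c')k+\delta n}\, m^{O(1)} \sum_{\ell=0}^{\lceil n/k \rceil} \binom{m+\lceil n/k \rceil}{\ell} \notag \\
 &\le 2^{(1+c')k+\delta n}\, m^{O(1)} \binom{m+\lceil n/k \rceil}{\lceil n/k \rceil} \tag{1} \\
 &\le 2^{(1+c')k+\delta n}\, m^{O(1)}\, (2m)^{n/k} \tag{2} \\
 &\le 2^{(1+c')k+\delta n}\, m^{O(1)} \tag{3} \\
 &\le 2^{\varepsilon n}\, m^{O(1)}. \notag
\end{align}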

The first inequality follows after replacing ℓ by the larger value ⌈n/k⌉ in the upper part of the binomial coefficient,
and upper bounding the term $2^{-\delta \ell k}$ by 1. Inequality (1) follows from the fact that the largest binomial coefficient
in the summation is $\binom{m+\lceil n/k \rceil}{\lceil n/k \rceil}$ (we may assume m ≥ ⌈n/k⌉; otherwise m is a constant, and the instance of CNF-Sat
can be solved in polynomial time from the beginning), and hence, the summation can be replaced by the largest
binomial coefficient multiplied by the number of terms (⌈n/k⌉ + 1) in the summation, which gets absorbed by
the term $m^{O(1)}$. Inequality (2) follows from the trivial upper bound on the binomial coefficient (the ceiling can
be removed because polynomials in m get absorbed). Inequality (3) and the last inequality follow after noting that
n/k is a constant (depending on ε), and after substituting k and δ by their values/bounds.

It follows that the algorithm $A_\varepsilon$ solves CNF-Sat in time $2^{\varepsilon n} m^{O(1)}$. Therefore, if BOOLEAN CSP has a
nonuniform subexponential-time algorithm, then so does CNF-Sat. The algorithm is nonuniform because the
polynomial factor in the running time (exponent of m) depends on ε. □



4 Instance size and number of tuples 

In this section we give characterizations of the subexponential-time complexity of CSP with respect to the
instance size and the number of tuples. Recall that the size of an instance I = (V, D, C) of CSP is size =
$\sum_{(S,R) \in C} |S| \cdot |R|$. We also show that the subexponential-time solvability of BOOLEAN CSP with linear size, or
linear number of tuples, is equivalent to the statement that the ETH fails.

Lemma 2. Unless the ETH fails, BOOLEAN CSP is not solvable in subexponential time if the instance size is
Ω(n).

Proof. Let s(n) = Ω(n) ≥ cn be a complexity function, where c > 0 is a constant. Suppose that the restriction
of CSP to instances of size at most s(n) is solvable in subexponential time; we will show that 3-CNF-Sat
is solvable in subexponential time. By Lemma 1, it is sufficient to show that 3-CNF-Sat with a linear number
of clauses is solvable in $2^{o(n)}$ time. Using a padding argument, we can prove the preceding statement assuming
any linear upper bound on the number of clauses; we pick this linear upper bound to be cn/24, where c is the
constant in the bound on s(n).

Let F be an instance of 3-CNF-Sat with n variables and at most cn/24 clauses. We reduce F to an instance
$I_F$ of Boolean CSP using the same reduction described in the proof of Theorem 1: to each clause C of F we
associate a constraint whose variables are those in C and whose tuples are those corresponding to the satisfying
assignments to C. Since the width of C is 3 and the number of clauses is at most cn/24, the instance $I_F$ consists
of at most cn/24 constraints, each containing at most 3 variables and 8 tuples. Therefore, the size of $I_F$ is at
most cn. We now apply the hypothetical subexponential-time algorithm to $I_F$. Since $|I_F|$ is linear in n, and since
the reduction takes linear time in n, we conclude that 3-CNF-Sat is solvable in time $2^{o(n)} n^{O(1)} = 2^{o(n)}$. The
proof follows. □

Lemma 3. CSP restricted to instances with o(n) tuples is solvable in subexponential-time. 

Proof. Let s(n) = o(n) be a complexity function, and consider the restriction of CSP to instances with at most
s(n) tuples. We will show that this problem is solvable in time $\mathrm{dom}^{o(n)} |I|^{O(1)}$. Consider the algorithm A
that, for each tuple in a constraint, branches on whether or not the tuple is satisfied by the satisfying assignment
sought. A branch in which more than one tuple in any constraint is selected as satisfied is rejected, and likewise
for a branch in which no tuple in a constraint is selected. For each remaining branch, the algorithm checks if the
assignment to the variables stipulated by the branch is consistent. If it is, the algorithm accepts; the algorithm
rejects if no branch corresponds to a consistent assignment. Clearly, the algorithm A is correct, and runs in time
$2^{s(n)} |I|^{O(1)} = 2^{o(n)} |I|^{O(1)}$. □

Noting that the number of tuples is a lower bound for the instance size, the following theorem follows from
Lemma 2 and Lemma 3.

Theorem 2. CSP is solvable in subexponential time for instances in which the number of tuples is o(n), and
unless the ETH fails, is not solvable in subexponential time if the number of tuples in the instances is Ω(n).
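
The branching algorithm from the proof of Lemma 3 can be sketched as follows. Instead of branching on every tuple and then rejecting branches that select zero or several tuples in a constraint, this sketch directly enumerates the surviving branches, that is, one tuple per constraint; this is an equivalent reformulation chosen for brevity, and the function name is hypothetical.

```python
from itertools import product

def solve_by_tuple_selection(constraints):
    """Return a satisfying assignment (dict) or None.

    constraints is a list of (scope, relation) pairs. Every variable is
    assumed to occur in some scope, as stipulated in the preliminaries.
    """
    for choice in product(*(sorted(rel) for _, rel in constraints)):
        tau, ok = {}, True
        for (scope, _), t in zip(constraints, choice):
            for x, v in zip(scope, t):
                # setdefault records the first value seen for x; a later
                # different value means the selected tuples clash.
                if tau.setdefault(x, v) != v:
                    ok = False
                    break
            if not ok:
                break
        if ok:
            return tau
    return None

neq = {(0, 1), (1, 0)}
print(solve_by_tuple_selection([(("x", "y"), neq), (("y", "z"), neq)]))
```

The number of enumerated choices is the product of the relation sizes, which is at most 2 raised to the total number of tuples, so o(n) tuples give $2^{o(n)}$ branches.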

Next, we show that the subexponential-time solvability of Boolean CSP with linear size, or with linear 
number of tuples, is equivalent to the statement that the ETH fails. We first need the following lemma. 

Lemma 4. If the ETH fails then Boolean CSP with linear number of tuples is solvable in subexponential time. 

Proof. We give a serf-reduction from BOOLEAN CSP with linear number of tuples to BOOLEAN r-CSP for some
constant r ≥ 3 to be specified below. The statement will then follow from Proposition 1.

Let s(n) ≤ cn be a complexity function, where c > 0 is a constant. Consider the restriction of BOOLEAN
CSP to instances in which the number of tuples is at most cn; we will refer to this problem as BOOLEAN LINEAR
TUPLE CSP. Let 0 < ε < 1 be given. Choose a positive integer constant d large enough so that the unique root
of the polynomial $x^d - x^{d-1} - 1$ in the interval (1, ∞) is at most $2^{\varepsilon/c}$. (The uniqueness of the root was shown
in [Chen et al., 2001, Lemma 4.1], and the fact that the root converges to 1 as d → ∞ can be easily verified.) Let
I be an instance of BOOLEAN LINEAR TUPLE CSP. We will assume that, for any constraint C in I, and any two
variables x, y in C, there must be at least one tuple in C in which the values of x and y differ. If not, then the
values of x and y in any assignment that makes I consistent have to be the same; in this case we remove all tuples
from I in which the values of x and y differ, replace y with x in every constraint in I, and simplify I accordingly
(if a constraint becomes empty during the above process then we reject I).

We now apply the following branching procedure to I. For each constraint C in I with more than d tuples,
pick a tuple t in C and branch on whether or not t is satisfied in an assignment that makes I consistent (if such
an assignment exists). In the branch where t is satisfied, remove C from I, remove every tuple in I in which the
value of a variable that appears in C does not conform to the value of the variable in t, and finally remove all
variables in C from I and its tuples (if a constraint becomes empty reject I). In the branch where t is not satisfied,
remove t from C. Note that each branch either removes a tuple or removes at least d tuples. We repeat the above
branching until each constraint in the resulting instance contains at most d tuples. The above branching can be
depicted by a binary search tree whose leaves correspond to all the possible outcomes from the above branching.
The number of the leaves in the search tree is $O(x_0^{cn})$, where $x_0$ is the root of the polynomial $x^d - x^{d-1} - 1$ in
the interval (1, ∞). (The branching vector is not worse than (1, d).) By the choice of d, the number of leaves
in the search tree is $O(2^{\varepsilon n})$. Let I' be the resulting instance at a leaf of the search tree. We claim that the arity
of I' is at most $2^d$. Suppose not, and let C be a constraint in I' whose arity is more than $2^d$. Pick an arbitrary
ordering of the tuples in C, and list them as $t_1, \ldots, t_s$, where s ≤ d. For each variable in C, we associate a binary
sequence of length s whose i-th bit is the value of the variable in $t_i$. Since the arity is more than $2^d$, the number
of binary sequences is more than $2^d$. Since the length of each sequence is s ≤ d, by the pigeon-hole principle,
there exist two binary sequences that are identical. This contradicts our assumption that no constraint has two
variables whose values are identical in all the tuples of the constraint. It follows that the instance I' is an instance
of BOOLEAN $2^d$-CSP. Since the number of variables in I' is at most that of I, and the number of leaves in the
search tree is $O(2^{\varepsilon n})$, we have a serf-reduction from BOOLEAN LINEAR TUPLE CSP to BOOLEAN r-CSP for
some constant r. □
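
The choice of d in the proof can be explored numerically. The sketch below, with hypothetical function names, computes the root of $x^d - x^{d-1} - 1$ in (1, 2] by bisection (the polynomial is increasing for x ≥ 1, takes the value −1 at x = 1 and a nonnegative value at x = 2) and searches for the smallest d whose root is at most $2^{\varepsilon/c}$.

```python
def branching_root(d, tol=1e-12):
    """Root in (1, 2] of x^d - x^(d-1) - 1, i.e. of branching vector (1, d)."""
    def f(x):
        return x**d - x**(d - 1) - 1
    lo, hi = 1.0, 2.0          # f(1) = -1 < 0 and f(2) = 2^(d-1) - 1 >= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi

def smallest_d(eps, c):
    """Smallest integer d >= 1 whose branching root is at most 2^(eps/c)."""
    target = 2 ** (eps / c)
    d = 1
    while branching_root(d) > target:
        d += 1
    return d

# Illustrative values only: eps = 0.5 and c = 1.
d = smallest_d(0.5, 1)
print(d, branching_root(d))
```

For d = 2 the root is the golden ratio, and the root decreases towards 1 as d grows, which is what makes a suitable constant d exist for every fixed ε and c.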

Lemma 2, combined with Lemma 4 after noting that the size is an upper bound on the number of tuples, gives
the following results.

Theorem 3. Boolean CSP with linear number of tuples is solvable in subexponential time if and only if the
ETH fails.

Theorem 4. Boolean CSP with linear size is solvable in subexponential time if and only if the ETH fails.

5 Treewidth and number of constraints 

In this section we characterize the subexponential-time complexity of CSP with respect to the treewidth of
certain graphs that model the interaction of variables and constraints. Many NP-hard problems on graphs become
polynomial-time solvable for graphs whose treewidth is bounded by a constant. For a definition of treewidth
we refer to other sources [Bodlaender, 1998]. Freuder [1990] showed that CSP is polynomial-time solvable
if a certain graph associated with the instance, the primal graph, is of bounded treewidth. The primal graph
associated with a CSP instance I has the variables in I as its vertices; two variables are joined by an edge
if and only if they occur together in the scope of a constraint. Freuder's result was generalized in various
ways, and other restrictions on the graph structure of CSP instances have been considered [Gottlob et al., 2000;
Marx, 2010b]. If the treewidth of the primal graph is bounded, then so is the arity of the constraints. The incidence
graph provides a more general graph model, as it includes instances of unbounded arity even if the treewidth is
bounded. The incidence graph associated with I is a bipartite graph with one partition being the set of variables
in I and the other partition being the set of constraints in I; a variable and a constraint are joined by an edge if
and only if the variable occurs in the scope of the constraint. For a CSP instance, we denote by tw the treewidth
of its primal graph and by tw* the treewidth of its incidence graph.
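
The two graph models can be built directly from an instance. The following sketch uses adjacency sets and the same assumed (scope, relation) representation as elsewhere in these examples; the vertex labels ("var", x) and ("con", i) are a hypothetical convention used to keep the two sides of the incidence graph apart.

```python
from itertools import combinations

def primal_graph(variables, constraints):
    """Variables are adjacent iff they share the scope of some constraint."""
    adj = {x: set() for x in variables}
    for scope, _ in constraints:
        for x, y in combinations(scope, 2):
            adj[x].add(y)
            adj[y].add(x)
    return adj

def incidence_graph(variables, constraints):
    """Bipartite graph joining each constraint to the variables in its scope."""
    adj = {("var", x): set() for x in variables}
    for i, (scope, _) in enumerate(constraints):
        c = ("con", i)
        adj[c] = {("var", x) for x in scope}
        for x in scope:
            adj[("var", x)].add(c)
    return adj

# Hypothetical instance with a single ternary constraint.
cons = [(("x", "y", "z"), {(0, 0, 1)})]
primal = primal_graph(["x", "y", "z"], cons)
incidence = incidence_graph(["x", "y", "z"], cons)
print(primal, incidence)
```

A single constraint of arity r forms a clique on r vertices in the primal graph but only a star in the incidence graph, which illustrates why bounded incidence treewidth still allows unbounded arity.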

As shown by Bodlaender [1996], there exists for every fixed k a linear time algorithm that checks if a graph
has treewidth at most k and, if so, outputs a tree decomposition of minimum width. It follows that we can check
whether the treewidth of a graph is O(1) in polynomial time.

Lemma 5. CSP is solvable in polynomial time for instances whose incidence treewidth tw* is O(1).

Proof. If tw* is O(1) then the hypertree-width is also O(1) [Gottlob et al., 2000], and CSP is solvable in
polynomial time if the hypertree-width is O(1) [Gottlob et al., 2002]. Combining the preceding statements gives
the lemma. □



Lemma 6. Unless the ETH fails, CSP is not solvable in subexponential time if the number of constraints is ω(1).

Proof. Let λ(n) = ω(1) be a complexity function. We show that, unless the ETH fails, the restriction of CSP to
instances in which cons ≤ λ(n), denoted $\mathrm{CSP}_\lambda$, is not solvable in $\mathrm{dom}^{o(n)}$ time. By Proposition 1, it suffices to
provide a serf-reduction from BOOLEAN 3-CSP with a linear number of constraints to BOOLEAN $\mathrm{CSP}_\lambda$.

Let /be an instance of BOOLEAN CSP in which COns = n' < en, where c > Ois a constant. Let Ci, . . . , C„' 
be the constraints in /; we partition these constraints arbitrarily into [A(n)J many groups Ci, . . . ,Cr, where 
r < [A(ri)J, each containing at most \n' /\{n)~\ constraints. The serf-reduction A works as follows. A "merges" 
all the constraints in each group Ci,i — 1, . . . ,r, into one constraint C'^ as follows. The variable-set of C^ consists 
of the union of the variable-sets of the constraints in Ci. For each constraint C in Ci, iterate over all tuples in C. 
After selecting a tuple from each constraint in Ci, check if all the selected tuples are consistent, and if so merge 
all these tuples into a single tuple and add it to C^. By merging the tuples we mean form a single tuple over the 
variables in these tuples, and in which the value of each variable is its value in the selected tuples (note that the 
values are consistent). Since each constraint in / has arity at most 3, and hence contains at most 8 tuples, and since 
each group contains at most \n' /\{n)~\ constraints, C^ can be constructed in time 8^" /-^(")T n"-'^-^^ — 2°^'^\ and 
hence, all the constraints C^, . . . , C^ can be constructed in time 2°'^"^n'^'^^^ — 2°^'^\ We now form the instance 
I' whose variable-set is that of /, and whose constraints are C(, . . . , C^. Since r < [A(n)J, /' is an instance of 
CSPa. Moreover, it is easy to see that / is consistent if and only if /' is. Since /' can be constructed from / in 
subexponential time and the number of variables in /' is at most that of /, it follows that A is a serf -reduction 
from Boolean 3-CSP with a Unear number of constraints to CSPa. □ 
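
The merging step in the proof above can be sketched as follows. This is only an illustration under assumed conventions: a constraint is a pair (scope, set of allowed tuples), and the names `merge_group` and `merge_instance` are hypothetical. The groups are formed by simple slicing, which is one arbitrary partition among many.

```python
from itertools import product

def merge_group(group):
    """Merge a list of (scope, tuples) constraints into one equivalent constraint."""
    # Variable-set of the merged constraint: union of the scopes, in a fixed order.
    merged_scope = tuple(sorted({x for scope, _ in group for x in scope}))
    merged_tuples = set()
    # Select one tuple from each constraint in the group.
    for selection in product(*(tuples for _, tuples in group)):
        assignment = {}
        consistent = True
        for (scope, _), tup in zip(group, selection):
            for var, val in zip(scope, tup):
                # setdefault returns the earlier value if var was already assigned.
                if assignment.setdefault(var, val) != val:
                    consistent = False
                    break
            if not consistent:
                break
        if consistent:
            merged_tuples.add(tuple(assignment[x] for x in merged_scope))
    return merged_scope, merged_tuples

def merge_instance(constraints, num_groups):
    """Partition the constraints into at most num_groups groups and merge each group."""
    if not constraints:
        return []
    size = -(-len(constraints) // num_groups)  # ceiling division
    groups = [constraints[i:i + size] for i in range(0, len(constraints), size)]
    return [merge_group(g) for g in groups]

# Hypothetical example: x != y and y != z merge into one constraint over (x, y, z).
c1 = (("x", "y"), {(0, 1), (1, 0)})
c2 = (("y", "z"), {(0, 1), (1, 0)})
merged = merge_group([c1, c2])
```

The loop over `product` enumerates at most 8 to the power of the group size selections when every constraint has arity at most 3, which matches the running-time bound used in the proof.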

Since $\mathrm{tw}^* = O(\mathrm{cons})$ (removing the vertices corresponding to the constraints from the incidence graph
results in an independent set), Lemma 5 and Lemma 6 give the following results.

Theorem 5. CSP is solvable in polynomial time for instances with $O(1)$ constraints, and unless the ETH fails,
is not solvable in subexponential time if the number of constraints is $\omega(1)$.

Theorem 6. CSP is solvable in polynomial time for instances whose incidence treewidth tw* is $O(1)$, and unless
the ETH fails, is not solvable in subexponential time for instances whose tw* is $\omega(1)$.
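
The bound of tw* by the number of constraints, which links Theorems 5 and 6, can be made concrete with an explicit decomposition. The sketch below is an illustration under assumed conventions (constraints as (scope, tuples) pairs, the hypothetical name `incidence_decomposition`, at least one variable); it is not taken from the paper. It places all constraint vertices in every bag and adds one variable per bag, arranged along a path.

```python
def incidence_decomposition(variables, constraints):
    """Path decomposition of the incidence graph whose bags have size cons + 1."""
    con_nodes = [("con", i) for i in range(len(constraints))]
    # One bag per variable: all constraint vertices plus that single variable.
    bags = [set(con_nodes) | {("var", x)} for x in variables]
    width = max(len(bag) for bag in bags) - 1
    return bags, width

# Hypothetical instance with 3 variables and 2 constraints.
variables = ["x", "y", "z"]
constraints = [
    (("x", "y"), {(0, 1), (1, 0)}),
    (("y", "z"), {(0, 0), (1, 1)}),
]
bags, width = incidence_decomposition(variables, constraints)
```

Every edge of the incidence graph joins a variable to a constraint, so it lies in the bag of that variable; each constraint vertex occurs in all bags and each variable vertex in exactly one, so the bags containing any fixed vertex are contiguous on the path. The width equals the number of constraints.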

Theorem 7. CSP is solvable in subexponential time for instances whose primal treewidth tw is $o(n)$, and is
not solvable in subexponential time for instances whose tw is $\Omega(n)$ unless (the general) CSP is solvable in
subexponential time.

Proof. The fact that CSP is solvable in subexponential time if $\mathrm{tw} = o(n)$ follows from the facts that: (1) we
can compute a tree decomposition of width at most $4 \cdot \mathrm{tw}$ in time $2^{O(\mathrm{tw})} |I|^{O(1)}$ [Amir, 2010], and (2) CSP is
solvable in time $O(\mathrm{dom}^{\mathrm{tw}}) |I|^{O(1)}$ [Freuder, 1990].

Let $s(n) = cn$, where c > 0 is a constant, and consider the restriction of CSP to instances whose tw is at most
$s(n)$, denoted LINEAR-tw-CSP. Note that the number of vertices in the primal graph is n, and hence $\mathrm{tw} \le n$.
Therefore, if $c \ge 1$, then the statement trivially follows. Suppose now that c < 1, and let I be an instance of
CSP with n variables. By "padding" $\lceil 1/c \rceil$ disjoint copies of I we obtain an instance I' that is equivalent to
I, whose number of variables is $N' = \lceil 1/c \rceil n$, and whose tw is the same as that of I. Since the tw of I is at
most n, it follows that the tw of I' is at most $cN'$, and hence I' is an instance of LINEAR-tw-CSP. This gives a
serf-reduction from CSP to LINEAR-tw-CSP. □
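
For completeness, the inequality behind the padding step can be written out as a single chain. This is a worked restatement of the argument above, using only the quantities defined in the proof.

```latex
\[
  \mathrm{tw}(I') \;=\; \mathrm{tw}(I) \;\le\; n
  \;=\; \frac{1}{c} \cdot c\,n
  \;\le\; \left\lceil \frac{1}{c} \right\rceil c\,n
  \;=\; c\,N'.
\]
```

The first equality holds because the treewidth of a disjoint union of graphs is the maximum treewidth of its components. Since c is a constant, $N' = \lceil 1/c \rceil n$ is linear in n, which is what a serf-reduction requires of the new parameter.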

We note that the hypothesis "CSP is solvable in subexponential time" in the above theorem implies that
"ETH fails" by Proposition 1, and implies that CNF-Sat has a nonuniform subexponential-time algorithm by
Theorem 1. We also note the difference between the subexponential-time complexity of CSP with respect to
the two structural parameters tw and tw*: whereas the threshold function for the subexponential-time solvability
of CSP with respect to tw is $o(n)$, the threshold function with respect to tw* is $O(1)$.

6 Degree and arity 

In this section we give characterizations of the subexponential-time complexity of CSP with respect to the degree 
and the arity. The proofs are omitted. 



Theorem 8. Unless ETH fails, CSP is not solvable in subexponential time if $\mathrm{deg} \ge 2$.

Proof. The statement follows from the proof of Theorem 1 after noting that, by Lemma 1, one can use r-CNF-Sat
with degree at most 3 (after introducing a linear number of new variables) in the reduction. This will result in
instances of Boolean r-CSP with degree at most 3 as well. Now for each variable x of degree 3 in an instance
of Boolean r-CSP, we introduce two new variables x', x'', and add a constraint whose variables are {x, x', x''}
and which contains the two tuples (0, 0, 0) and (1, 1, 1); this constraint stipulates that the values of x, x', x'' be the
same. We then substitute the variable x in one of the constraints it appears in with x', and in another constraint that
it appears in with x''. Therefore, in the new instance, the degree of each of x, x', x'' becomes 2. After repeating
this step for every variable of degree 3, we obtain an instance of Boolean r-CSP in which the degree of each
variable is at most 2. Since the increase in the number of variables is linear, a subexponential-time algorithm for
Boolean r-CSP with degree at most 2 implies a subexponential-time algorithm for r-CNF-Sat. □
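
The variable-splitting step of this proof can be sketched as follows. The sketch assumes that constraints are (scope, tuples) pairs, that variables are strings, that every variable has degree at most 3 and occurs at most once per scope, and that the primed names are fresh; the function name `reduce_degree` is hypothetical.

```python
from collections import defaultdict

def reduce_degree(constraints):
    """Split every variable of degree 3 so that all degrees become at most 2."""
    occurrences = defaultdict(list)
    for i, (scope, _) in enumerate(constraints):
        for x in scope:
            occurrences[x].append(i)

    result = [(tuple(scope), set(tuples)) for scope, tuples in constraints]
    for x, where in occurrences.items():
        if len(where) != 3:
            continue
        x1, x2 = f"{x}'", f"{x}''"  # fresh copies of x (assumed not to clash)
        # Replace x by x' in one constraint and by x'' in another.
        for copy, i in zip((x1, x2), where[:2]):
            scope, tuples = result[i]
            result[i] = (tuple(copy if y == x else y for y in scope), tuples)
        # Equality constraint forcing x, x', x'' to take the same value.
        result.append(((x, x1, x2), {(0, 0, 0), (1, 1, 1)}))
    return result

# Hypothetical instance in which x has degree 3.
neq = {(0, 1), (1, 0)}
instance = [(("x", "a"), neq), (("x", "b"), neq), (("x", "c"), neq)]
reduced = reduce_degree(instance)
```

Each split adds two variables and one constraint, so the number of variables grows by at most a factor of 3, in line with the linear increase noted in the proof.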

As mentioned in Section 1, there is a folklore reduction from an instance of 3-COLORABILITY with n
vertices that results in an instance of CSP with n variables, arity = 2, and dom = 3. Since the 3-COLORABILITY
problem is SNP-complete under serf-reductions [Impagliazzo et al., 2001], we get:



Theorem 9. Unless ETH fails, CSP is not solvable in subexponential time if $\mathrm{arity} \ge 2$ (and $\mathrm{dom} \ge 3$).

Proof. We will show that a subexponential-time algorithm for CSP with arity = 2 and dom = 3 implies that
the 3-COLORABILITY problem is solvable in subexponential time. Since the 3-COLORABILITY problem is SNP-complete
under serf-reductions [Impagliazzo et al., 2001], the statement of the theorem will follow. Recall that
the 3-COLORABILITY problem asks if the vertices of a given graph can be colored with at most 3 colors so that
no two adjacent vertices are assigned the same color.

The reduction is folklore. Given an instance G = (V, E) of 3-COLORABILITY, we construct an instance I
of CSP as follows. The variables of I correspond to the vertices of G, and the domain of I corresponds to the
color-set {1, 2, 3}. For every edge of the graph we construct a constraint of arity = 2 over the two variables
corresponding to the endpoints of the edge. The constraint contains all tuples corresponding to valid colorings of
the endpoints of the edge. It is easy to see that G has a 3-coloring if and only if I is consistent. Since for the
instance I we have vars = n, arity = 2, and dom = 3, an algorithm running in time $\mathrm{dom}^{o(n)}$ for CSP with
arity = 2 and dom = 3 would imply a subexponential-time algorithm for 3-COLORABILITY. □
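
The folklore reduction described in this proof is short enough to write down directly. The sketch below assumes graphs given as a vertex list and an edge list, and constraints as (scope, tuples) pairs; the function names are hypothetical. The consistency check is the trivial brute-force search over all assignments and is included only to exercise the reduction on tiny graphs.

```python
from itertools import product

COLORS = (1, 2, 3)

def coloring_to_csp(vertices, edges):
    """One variable per vertex; one binary constraint per edge listing valid colorings."""
    valid = {(a, b) for a in COLORS for b in COLORS if a != b}
    constraints = [((u, v), valid) for u, v in edges]
    return list(vertices), constraints

def is_consistent(variables, constraints, domain=COLORS):
    """Brute-force search over all dom^n assignments (for illustration only)."""
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(tuple(assignment[x] for x in scope) in tuples
               for scope, tuples in constraints):
            return True
    return False

# A triangle is 3-colorable; the complete graph on 4 vertices is not.
tri_vars, tri_cons = coloring_to_csp("abc", [("a", "b"), ("b", "c"), ("a", "c")])
```

Every constraint produced has arity 2 and contains 6 tuples over a domain of size 3, and the number of variables equals the number of vertices, as required by the proof.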

We note that CSP with dom = 2 and arity = 2 is solvable in polynomial time via a simple reduction to 
2-CNF-Sat. 
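
One standard way to carry out such a reduction, which may or may not be the one the authors have in mind, is to emit one clause per forbidden tuple. The sketch below assumes constraints given as (scope, tuples) pairs over the domain {0, 1} and represents a literal as a pair (variable, polarity); these conventions and the function name are illustrative assumptions.

```python
from itertools import product

def boolean_binary_csp_to_2cnf(constraints):
    """Return a list of clauses, each with at most 2 literals, one per forbidden tuple."""
    clauses = []
    for scope, tuples in constraints:
        assert len(scope) <= 2, "arity must be at most 2"
        for forbidden in product((0, 1), repeat=len(scope)):
            if forbidden in tuples:
                continue
            # The clause is falsified exactly by the forbidden tuple:
            # (x, True) stands for the literal x, (x, False) for NOT x.
            clauses.append([(x, val == 0) for x, val in zip(scope, forbidden)])
    return clauses

# Hypothetical example: the constraint x != y forbids (0, 0) and (1, 1).
clauses = boolean_binary_csp_to_2cnf([(("x", "y"), {(0, 1), (1, 0)})])
```

A constraint of arity 2 yields at most 4 clauses, so the resulting 2-CNF formula has size linear in the number of constraints, and 2-CNF-Sat is solvable in polynomial time.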

7 Conclusion 

We have provided a first analysis of the subexponential-time complexity of CSP under various restrictions. We 
have obtained several tight thresholds that dictate the subexponential-time complexity of CSP. These tight results 
are summarized in the following table. 



CSP $\in$ SUBEXP               CSP $\notin$ SUBEXP (assuming the ETH)    Result

tuples $\in o(n)$              tuples $\in \Omega(n)$                     Theorem 2
cons $\in O(1)$ (even in P)    cons $\in \omega(1)$                       Theorem 5
tw* $\in O(1)$ (even in P)     tw* $\in \omega(1)$                        Theorem 6
tw $\in o(n)$                  tw $\in \Omega(n)$                         Theorem 7

Furthermore, we have linked the subexponential-time complexity of CSP with bounded arity to CLIQUE, and
CSP with bounded domain size to CNF-Sat. These results suggest that these restrictions of CSP may be
"harder than" k-CNF-Sat (for which a subexponential-time algorithm would lead to the failure of the ETH)
with respect to subexponential-time complexity. It would be interesting to provide stronger theoretical evidence
for this separation.